BM432 Data Visualisation Workshop

Morgan Feeney

University of Strathclyde

Leighton Pritchard

University of Strathclyde

2024-10-22

1. Introduction

Learning Objectives

  • You should be able to critically analyse how data is visualised
  • You should be able to judge a figure’s clarity and potential for misunderstanding
  • You should be able to identify potential sources of bias resulting from the visualisation
  • You should understand how to create effective figures for your own work

2. Critique of Published Scientific Figures

Example 1

Figure 1: Small molecules identified in previous HTS increase GCase activity.

Your Critical Analysis of Example 1

Reasons for the grade
clear and comprehensive data
the figures and panels were clear, everything was labelled well and easy to follow, good use of white space. the legend is comprehensive and easy to follow. A16 and A18 were labelled with distinct colors which made it easy to identify.
Clear to understand without reading the paper, nice diagrams, and figure legend.
A good colour scheme has been used and the white space has been used effectively. However, there is missing statistical analysis on some of the bars (A18 - 107).
Overall graph is structured well. However there is a missed statistical analysis comparing A18 (107) in graph B. Graph G does not clarify what it’s trying to quantify. But the graph overall is appropriate. In terms of aesthetics, they used good colour schemes and it is easily recognisable.
The figure is well organised and structured and there is consistency with colour labelling for all the images. Good labelling of images across the figure and a good description of most things going on in every section of the figure helps the reader follow through and not get lost.
Distinct colours, good use of white space. Inclusion of both microscopy images and data quantification makes their data look more convincing.
clear bar chart; good color with gradient depths, making it more intuitive
I can understood most of the figure and the meaning of this
Because the figure and legend are scientifically accurate, clearly structured, and well explained, demonstrating a strong understanding of the experimental design and results interpretation. Minor improvements in abbreviation clarity are the only issue.
easy to understand, clear
It wasn’t very easy to understand, and I would need to read the paper in order to know what was the topic. The title wasn’t great as it told us the result and not just what the test was.

Your Critical Analysis of Example 1

Suggestions for improvement
figure size for each is different
labelling for the concentrations could have been better. The 95% CI range was not included. did not explain the meaning of ns. too many panels and describes different cells. panels D and G were only explained by “quantification” and could be clearer
Maybe the presence of A18 in F and G.
If the bar graphs are going to be high, use a higher scale for the y-axis. 1B- No statistical stars between the vehicle control and the fibroblast treated with 0.1uM of A18. 1C- No mention of DAP1 and Merge in terms of nuclear localization. 1D- Not much detail about this legend, hence not easily understood. 1F- No mention of DAP1 and Merge in terms of nuclear localization. 1G- Not much detail in legend, hence not easily understood. For the sake of those who have colour blindness, it will be good to use a brighter background colour for Merge nuclear localization in Figures 1C and 1F. The figure legend title could’ve included nuclear localization and established the connection it has with GCase activity.
Use of bar charts does not show individual data points or give a good idea of data distribution - though including the total number within the bar is helpful. - This also makes it challenging to decide if some of their statistical tests were good choices.
There is an inconsistency in their N numbers that they do not specify (and are quite vague about) - I think this is an important thing to include in the figure legend, and raises questions regarding if additional replicates were made after initially looking at the data. I think they should be transparent in their figure legend about how many replicates were used for each experiment they visualise.
Spell out terms like CI (confidence interval) or shRNA on first use if this figure appears in a standalone report.
Including what colour represented what in the legend would of helped, aswell as just making it clearer on what is being looked at. instead of just putting 72 h I would put 72hrs or the word hours to make it very clear what is being spoke about, unless previously discussed the abbreviation.

Critique 1.1

Issue

Dynamite plot (a bar chart with error bars): the lower extent of the error bars is hidden by the bars.

Solution

Use boxplots or 1D scatterplots, which show the spread of the data.
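As an illustration of this kind of plot, here is a minimal Python sketch using matplotlib. The replicate values, the group names and the axis label are made up for the example and are not taken from the paper. The function name `plot_points_with_box` is hypothetical.

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line to use an interactive window
import matplotlib.pyplot as plt

# Hypothetical replicate measurements (illustrative numbers only)
data = {
    "Vehicle": [98, 102, 95, 105, 100, 97],
    "A16": [118, 125, 110, 131, 122, 115],
    "A18": [121, 140, 109, 133, 127, 118],
}


def plot_points_with_box(groups, jitter=0.08, seed=0):
    """Draw one boxplot per group and overlay every individual observation."""
    rng = random.Random(seed)  # fixed seed so the horizontal jitter is reproducible
    labels = list(groups)
    positions = list(range(1, len(labels) + 1))
    fig, ax = plt.subplots(figsize=(4, 3))
    # Boxes show the median and quartiles; outlier markers are hidden
    # because every point is drawn explicitly below.
    ax.boxplot([groups[name] for name in labels], positions=positions,
               showfliers=False, widths=0.5)
    for pos, name in zip(positions, labels):
        xs = [pos + rng.uniform(-jitter, jitter) for _ in groups[name]]
        ax.scatter(xs, groups[name], s=18, alpha=0.8, zorder=3)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels)
    ax.set_ylabel("Activity (% of vehicle)")
    return fig, ax


fig, ax = plot_points_with_box(data)
```

Calling `fig.savefig("example.png", dpi=300)` would write the figure to disk. Unlike a dynamite plot, every replicate is visible, so the reader can judge the spread and the sample size directly.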

Critique 1.2

Issue

Incomplete presentation of statistical comparisons, e.g., is there a difference between A16 and A18 in (B)?

Solution

Present a table of statistical differences instead, or alongside the figure.

Critique 1.3

Issue

Distance between bars makes comparison awkward.

Solution

Place things to be compared by the reader next to each other where possible, to facilitate visual comparison.

Critique 1.4

Issue

The scale bar on the micrographs (especially in panel C) is too small to read easily.

Solution

Increase the size of the scale bar relative to the figure so that it can be read.

Critique 1.5

Issue

A visual control (bright-field image) is absent.

Solution

In addition to showing DAPI and immunofluorescence images of the cells, include a bright-field micrograph of the cells (no fluorescence).

Critique 1.6

Issue

The colour scheme is misleading because saturation represents different data across figure panels (compare 1 \(\mu\)M A18 in B vs 5 \(\mu\)M in D, and 1 \(\mu\)M A18 in E).

Solution

Be consistent with the visual messaging of colour (hue, saturation, and luminance).

Critique 1.7

Issue

Too many comparisons in B clutter the figure and compress the space available for showing data.

Solution

Present a table of statistical differences instead, or alongside the figure, to reduce clutter.

Critique 1.8

Issue

\(y\)-axis scales vary between panels, so quantitative comparison between panels is difficult.

Solution

Use the same \(y\)-axis scale in all panels to facilitate direct comparison.
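One possible way to enforce a common scale, sketched in Python with matplotlib. The panel names and values are invented for illustration. `sharey=True` is the matplotlib option that links the y-axes of the subplots.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line to use an interactive window
import matplotlib.pyplot as plt

# Hypothetical panel data (illustrative values only)
conditions = ["Vehicle", "A16", "A18"]
panels = {
    "Panel B": [100, 118, 121],
    "Panel D": [100, 160, 210],
    "Panel E": [100, 105, 112],
}

# sharey=True makes every panel use one common y-axis range
fig, axes = plt.subplots(1, len(panels), sharey=True, figsize=(7, 2.5))
for ax, (title, values) in zip(axes, panels.items()):
    ax.plot(conditions, values, "o")
    ax.set_title(title)

axes[0].set_ylabel("Activity (% of vehicle)")
axes[0].set_ylim(bottom=0)  # the limit propagates to all linked panels
```

With linked axes, a point at the same height means the same value in every panel, so the reader can compare across panels without re-reading each scale.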

Our Critical Analysis

Example 2

Figure 2: Endometriosis-associated macrophages exhibit significant transcriptomic heterogeneity.

Your Critical Analysis of Example 2

Reasons for the grade
good colours and presentation
The panels are not organised and the figure is overwhelming and kind of hard to interpret.
It was a bit unclear and Figure B was a bit hard to read
The figures are not presented well and are difficult to understand. In Figure C, the colour scheme has not been explained. Can’t understand the figures without the figure legend.
Figure is not easily understandable without the help of legends. Horrible sizes in graphs and the use of white spaces. Graph C does not describe the colour coding scheme (what each colour mean)
Overall, each image in the figure is labelled very well and this helps the reader connect what is being said in the legend to a particular part of the figure. Various techniques have been used for different experiments and seeing all of this in a figure makes it look very artistic.
Immediately clear visualisation of methods and data. Little use of blank space, panels are unclear and some graphs don’t label axes.
contains too many elements; color combination is chaotic; some words are too small to read
I know what the figure want to tell me for 80%
This figure and legend are exceptionally clear and detailed, demonstrating an excellent understanding of single-cell transcriptomic analysis and data interpretation. The structure, labeling, and explanations are of publication quality, with no major issues identified.
little difficult to understand, clear
It was good having a description of what each of the abbreviations were. I think the layout wasn’t very good, as it didn’t flow very well and was hard to tell which part of figure was with each letter. Just think its quite hard to understand what is happening.

Your Critical Analysis of Example 2

Suggestions for improvement
looks abit overwhelming
organise the panels. add more explanations for each panel in the legend. make imaging clearer in terms of colors
clear separation of figures
The size of certain images (e.g. the ones on the right hand side of figure 2E) should be increased so that it can be more visible to the reader. Graph 2B is scattered all over the place, so no coherent order. Although Graph C has every bit of the image labelled correctly, there should only be the colour of each sample type showing the percentage of their cluster membership. In figure 2D it’s very challenging to spot five differentially expressed genes for each cluster and the image is quite fussy, making it confusing to find out what’s going on.
no suggestion
I would spread it out a bit more, add what the colours link to in the legend, and try create a clearer picture. Also D you couldn’t read the data at the size so therefore couldn’t understand fully what was going on.

Critique 2.1

Issue

UMAP plots (B, E) are highly manipulable and clustering/placement does not necessarily reflect objective measures.

Solution

Be cautious of over-interpretation of UMAP and other nonlinear dimensionality reduction plots.

Critique 2.2

Issue

Unpleasant clashing (R default) colour choices in (C).

Solution

Use a deliberately chosen colour palette (ideally a colourblind-accessible one) rather than the software defaults.

Critique 2.3

Issue

The proportion plot in (C) does not give information on absolute number, only proportion/composition.

Solution

A proportional areas plot spanning all clusters would represent both absolute count per group and compositional information.
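A rough sketch of such a plot in Python with matplotlib, under the assumption that "proportional areas" means a mosaic-style chart. Column width encodes each cluster's share of all cells and segment height encodes composition within the cluster. The counts, cluster names, sample types and the helper `mosaic_rectangles` are all hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line to use an interactive window
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# Hypothetical cell counts per cluster and sample type (not from the paper)
counts = {
    "Cluster 0": {"Control": 400, "Lesion": 600},
    "Cluster 1": {"Control": 150, "Lesion": 50},
    "Cluster 2": {"Control": 30, "Lesion": 270},
}
colours = {"Control": "#0072B2", "Lesion": "#E69F00"}


def mosaic_rectangles(table):
    """Return (cluster, group, left, bottom, width, height) in unit-square coordinates."""
    grand_total = sum(sum(row.values()) for row in table.values())
    rects, left = [], 0.0
    for cluster, row in table.items():
        cluster_total = sum(row.values())
        width = cluster_total / grand_total  # column width encodes absolute count
        bottom = 0.0
        for group, n in row.items():
            height = n / cluster_total  # segment height encodes composition
            rects.append((cluster, group, left, bottom, width, height))
            bottom += height
        left += width
    return rects


rects = mosaic_rectangles(counts)
fig, ax = plt.subplots(figsize=(5, 3))
for cluster, group, left, bottom, width, height in rects:
    ax.bar(left, height, width=width, bottom=bottom, align="edge",
           color=colours[group], edgecolor="white")
ax.set_xlabel("Share of all cells (column width)")
ax.set_ylabel("Composition within cluster")
ax.legend(handles=[Patch(color=c, label=g) for g, c in colours.items()],
          loc="upper left", bbox_to_anchor=(1.0, 1.0))
```

Because each rectangle's area equals its count divided by the total number of cells, a large cluster looks large, which a plain stacked proportion plot cannot show.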

Critique 2.4

Issue

Heatmap text is too small to read comfortably.

Solution

Present heatmap as a separate figure, or reduce the amount of information in the image.

Critique 2.5

Issue

Heatmap is missing a scale (is purple high and yellow low, or vice versa?).

Solution

Add a colour scale bar showing which values the colours represent.
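For example, in Python with matplotlib a colour scale can be attached to a heatmap with `fig.colorbar`. This is only a sketch. The gene names, cluster names, values and the "row z-score" label are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line to use an interactive window
import matplotlib.pyplot as plt

# Hypothetical expression matrix: rows are genes, columns are clusters
genes = ["gene_a", "gene_b", "gene_c"]
clusters = ["C0", "C1", "C2", "C3"]
z_scores = [
    [1.8, -0.4, -0.9, -0.5],
    [-0.7, 1.5, -0.2, -0.6],
    [-0.3, -0.8, 1.9, -0.8],
]

fig, ax = plt.subplots(figsize=(4, 3))
image = ax.imshow(z_scores, cmap="viridis", aspect="auto")
ax.set_xticks(range(len(clusters)))
ax.set_xticklabels(clusters)
ax.set_yticks(range(len(genes)))
ax.set_yticklabels(genes)

# The colourbar is the scale: it tells the reader which colour means "high"
colourbar = fig.colorbar(image, ax=ax)
colourbar.set_label("Expression (row z-score)")
```

The label on the colourbar matters as much as the bar itself, since it states the quantity and any normalisation that was applied.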

Critique 2.6

Issue

The experimental summary in (A) does not indicate order of operations.

Solution

Use arrows to indicate order of steps and/or dataflow.

Critique 2.7

Issue

Text is too small in general to read comfortably.

Solution

Increase font size and/or break up the panels into individual figures.

Critique 2.8

Issue

The figure is crowded and the separation of panels is unclear.

Solution

Use whitespace to guide reader “flow” through the figure and reduce crowding - in particular, cramming (C) under the inset from (B) makes the figure feel very crowded. Alternatively, break the panels into multiple figures.

Critique 2.9

Issue

The overall message of the figure is unclear.

Solution

If the intent is just that the macrophages exhibit transcriptional heterogeneity, then (D) is probably sufficient. If other messages are intended, then revise for clarity.

Our Critical Analysis

Example 3

Figure 3: A C. difficile mutant lacking all three YkuD-type Ldts (\(\Delta\)ldt1-3) exhibits wild-type growth, morphology, and 3-3 cross-linking.

Your Critical Analysis of Example 3

Reasons for the grade
clear and simple to understand. not overwhelming
the legend is clear and the figure is well organised
Great representation of results simple to read
There is good colour use and good scientific communication. The figure legend is a bit long but has good detail.
Figure looks more structured. Figure legends describe results as well. Proper use of scientific communication.
This figure, indeed, tells a story; you can tell from the legend, starting with the title, correct labelling of each diagram and ideal selection of colours. Everything is magnificently organised, structured and nothing is difficult to read.
Excellent presentation of Data, easy to comprehend. The illustrated diagram contributes massively to telling a story with the data.
very clear legend
Just understood half the meaning of the figure
The figure is well-designed, with precise visual representation and a clear, detailed legend demonstrating strong understanding of experimental rationale. Minor improvements in conciseness could elevate it to outstanding level.
Clear, well-organized
Thought it was good and could understand it to an extent. I think the title is a bit long and could be more to the point. Data is presented very well.

Your Critical Analysis of Example 3

Suggestions for improvement
figure could be bigger. panel D is not quite clear. P values and annotations would make it clearer
no distinctive colors in figures
For figure 3C, it would be better to have the words ‘WT’ and ‘Idt1-3’ at the top of the images in the white space, just as the words ‘Fluorescence’ and ‘Phase Contrast’ are positioned on the left hand side of the images. For the bottom images of figure 3C it would be ideal to make sure that when doing fluorescent microscopy, all of the cells are captured and that there are none that are partly seen.
Figure B could have been placed next to figure A and then figure C underneath figure 3A and figure 3D & 3E could’ve been put next to figure 3C.
Alternative to a bar chart that shows data points would be preferable
no suggestion
Needs a better title.

Critique 3.1

Issue

Dynamite plot: the lower extent of these error bars is not visible.

Solution

Use boxplots or 1D scatterplots.

Critique 3.2

Issue

No complement of the triple mutant strain - missing data for an essential control?

Solution

Ensure that all controls are presented so that the experimental result can be interpreted properly.

Critique 3.3

Issue

Label obscures part of the image.

Solution

Relocate the label so data is not obscured.

Critique 3.4

Issue

Fluorescence wavelength not specified.

Solution

Include sufficient information that the reader does not need to refer to the main text. Overall a figure legend should provide enough detail for the figure to stand alone, but should not describe results/significance.

Critique 3.5

Issue

Colour schemes/palettes are inconsistent within panel A, and between panels B, D, and E.

Solution

Be consistent with the visual messaging of colour (hue, saturation, and luminance).

Critique 3.6

Issue

Excessive length of figure legend for panel A.

Solution

Break out panel A into its own figure.

Our Critical Analysis

Example 4

Figure 4: Functional characterization and overall structure of Rv1217c–1218c.

Your Critical Analysis of Example 4

Reasons for the grade
good colours and design. data is clear and easy to see and comprehend
figure is clearly organised and easy to follow. Data is clear and well visiualised, range of methods included, clear link to structure and function
Excellent figure
Figures are represented well and the figure legend is detailed. However, there is no statistical analysis.
Proper use of spaces. Figure legends are concise, not too long. However, there were no statistical analysis conducted in graph B
This figure tells a story and excellent practical techniques with appropriate colour choices were used to make this figure eye-catching. Again, just like the previous figure everything is magnificently structured, correctly labelled, readable font size and the legend provides everything I need to know about the figure, making it easily understood.
Good way to show the data clearly, protein structure contributes to storytelling.
looks good in both terms of clarity and aesthetics
not really sure am I understood the figure right
The figure achieves a near-perfect balance of experimental clarity, structural visualization, and explanatory precision. It demonstrates a publication-level mastery of both data interpretation and visual communication.
Clear, well-organized
Flowed well and the legend was very easy to follow, knew exactly what was being spoke about. The title is great, and the description of the colours is excellent.

Your Critical Analysis of Example 4

Suggestions for improvement
idk it looks amazing to me
lacks statistics in panel B. western blot panel is too small (easy to miss)
Ethambutol structure could have been included in figure 4A. Figure 4D could have been labelled as the secondary structure of Rv1217c.There could have been a figure 4E showing a secondary structure of Rv1218c. Full words for the abbreviations used in the figure legend could have been used to inform the reader of what these abbreviations mean.
no suggestion
Adding more labels at the top of C could improve it. Explain when talking about B that it is the %. There was no statistical analysis done for B, this could’ve been utilised because without this how will we know whether these results are reliable, no analysis can be done claiming any significance.

Critique 4.1

Issue

The rifampicin structure is purely decorative.

Solution

Remove unnecessary elements and avoid needless distractions in figures.

Critique 4.2

Issue

Dynamite plot: the lower extent of these error bars is not visible.

Solution

Use boxplots or 1D scatterplots.

Critique 4.3

Issue

Colour scheme in (B) doesn’t seem purposeful and doesn’t add anything to the figure.

Solution

Use colour consistently to link or distinguish elements/groups/data, rather than for decoration.

Critique 4.4

Issue

The meaning of the grey regions in (C) and (D) is unclear.

Solution

The implied membrane in (C) and (D) could be labelled as such in the figure. By convention in microbiology, we would assume that the periplasm/extracellular space is “up” and the cytoplasm is “down” in (C) and (D) - but this should really be labelled to avoid any potential confusion.

Critique 4.5

Issue

Colours in (A) are difficult to distinguish, especially with the red boxes, which seem to skew blue closer to purple.

Solution

Instead of well images, a heatmap with clearer colour distinction could be presented.

Critique 4.6

Issue

Showing two cut-out bands is not an appropriate way to show Western blot data.

Solution

Present the complete blot.

Critique 4.7

Issue

Text in (D) is too small to read easily.

Solution

Break out panel into a separate figure or increase font size.

Our Critical Analysis

3. Summing Up

General Comments - Data Presentation Choices

Issue

Authors do not always choose the best or most appropriate way to present their data (e.g., dynamite plots).

Solution

When making data visualisation choices, consider how you can best present your data transparently to the reader so that they can evaluate your work (e.g., show the raw data).

Be aware of conventions in the field and discuss with colleagues to come to a consensus on the best and most transparent way to present your data.

General Comments - Colour Choices

Issue

Authors do not always choose the best or most appropriate colours when designing data visualisations.

Solution

When choosing colours, try to:

  1. Be consistent (always use the same colour to represent the same group or feature)

  2. Be accessible (be aware of the different types of colourblindness and design appropriately)

  3. Be aesthetically pleasing (choose colours that work well together, or choose well-designed colour palettes)
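The first two points can be combined by fixing one group-to-colour mapping up front and reusing it in every panel. The Python sketch below uses the Okabe-Ito palette, a widely used colourblind-accessible set of eight colours. The helper `assign_colours` and the group names are hypothetical.

```python
# Okabe-Ito palette: eight colours chosen to remain distinguishable
# under the common forms of colour vision deficiency
OKABE_ITO = {
    "black": "#000000",
    "orange": "#E69F00",
    "sky blue": "#56B4E9",
    "bluish green": "#009E73",
    "yellow": "#F0E442",
    "blue": "#0072B2",
    "vermillion": "#D55E00",
    "reddish purple": "#CC79A7",
}


def assign_colours(groups, palette=OKABE_ITO):
    """Map each group name to one fixed colour, to be reused in every panel."""
    names = sorted(set(groups))  # sorting makes the mapping independent of input order
    if len(names) > len(palette):
        raise ValueError("More groups than colours: consider facets or direct labels")
    return dict(zip(names, palette.values()))


# Hypothetical groups; define the mapping once and import it wherever figures are made
GROUP_COLOURS = assign_colours(["Vehicle", "A16", "A18"])
```

A plotting call can then look up `GROUP_COLOURS["A18"]` rather than relying on the default colour cycle, so A18 has the same colour in every panel and every figure.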

General Comments - Readability

Issue

Authors do not always make figures large enough or easy to read.

Solution

When designing figures, make sure that they are appropriately sized and accessible (font large enough to read easily at 100% size).

Include enough space between the panels of a figure to let them “breathe” - the eye should flow smoothly from one element of the figure to the next in a natural way.

Try printing out a copy of the figure and looking at it from arm’s length to see how it feels in terms of whitespace and use of “real estate” - then make appropriate edits to improve the figure.
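One way to act on the sizing advice is to create the figure at its final printed size, so that the font sizes you set are the font sizes the reader sees at 100%. The Python sketch below uses matplotlib. The 170 mm width, 80 mm height and 8 pt minimum font size are assumptions chosen for illustration (check the target journal's or report's requirements). `smallest_font_size` is a hypothetical helper.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line to use an interactive window
import matplotlib.pyplot as plt
from matplotlib.text import Text

# Assumed targets for illustration only
WIDTH_MM, HEIGHT_MM = 170, 80
MIN_FONT_PT = 8
MM_PER_INCH = 25.4


def smallest_font_size(figure):
    """Return the smallest font size (in points) among non-empty text in the figure."""
    sizes = [t.get_fontsize() for t in figure.findobj(Text) if t.get_text().strip()]
    return min(sizes)


# rc_context applies the font setting only to figures created inside the block
with plt.rc_context({"font.size": MIN_FONT_PT}):
    fig, axes = plt.subplots(
        1, 2, figsize=(WIDTH_MM / MM_PER_INCH, HEIGHT_MM / MM_PER_INCH)
    )
    for ax, title in zip(axes, ["Panel A", "Panel B"]):
        ax.set_title(title)
        ax.set_xlabel("Condition")
        ax.set_ylabel("Response (arbitrary units)")
    fig.subplots_adjust(wspace=0.4)  # whitespace between panels
```

Saving with `fig.savefig("figure.pdf")` keeps the physical size, so a printout at 100% shows the text at the sizes set here. Checking `smallest_font_size(fig)` before export is a quick guard against labels that are too small.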

General Comments - Message

Issue

Authors sometimes include too much data in a figure, or make a figure overwhelming in other ways.

Solution

When sitting down to make a figure, be sure to define a clear message in the first instance. Present the data that are necessary to convey that message. Remove anything extraneous. (Split into multiple figures if necessary.)

Some Very Insightful Comments From You

  • “The data is presented in a manner that would likely be inaccessible for people without prior experience. A move toward a more palatable/digestible format will facilitate better science communication in the future.”

  • “I think that it is also crucially important to learn how to present data well and responsibly…”

Further Reading